2 research outputs found

    Graph based text representation for document clustering

    Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings in the document, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces both problems by proposing a graph-based text representation, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both. The findings demonstrate that the proposed text representation method leads to more accurate document clustering results.
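
The limitation of bag-of-words weighting described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_vectors` and `cosine` are assumptions chosen for illustration. It shows that two sentences with opposite meanings but identical words receive identical vectors, which is the kind of relational information a dependency graph is meant to retain.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (assumed smoothed TF-IDF)."""
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)


docs = ["dog bites man", "man bites dog", "stock prices fall"]
vecs = tfidf_vectors(docs)

# Word order is discarded, so the first two documents are indistinguishable.
sim_swapped = cosine(vecs[0], vecs[1])
# Documents with no shared terms have no overlap at all.
sim_unrelated = cosine(vecs[0], vecs[2])
```

Any clustering algorithm that consumes these vectors, K-means included, would place the first two documents together regardless of who bites whom.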

    Artificial Intelligence Based Deep Bayesian Neural Network (DBNN) Toward Personalized Treatment of Leukemia with Stem Cells

    The dynamic development of computer and software technology in recent years has been accompanied by the expansion and widespread adoption of artificial intelligence (AI) based methods in many aspects of human life. One prominent field of rapid progress is high-throughput methods in biology, which generate large amounts of data that need to be processed and analyzed. AI methods are therefore increasingly applied in the biomedical field, for example to RNA-protein binding site prediction, DNA sequence function prediction, protein-protein interaction prediction, and biomedical image classification. Stem cells are widely used in biomedical research, e.g., in studies of leukemia and other diseases. Our proposed Deep Bayesian Neural Network (DBNN) approach for the personalized treatment of leukemia achieved high test accuracy: the DBNN used in this study classified images with accuracy exceeding 98.73%. This study shows that the DBNN can classify cell cultures based only on unstained light microscope images, which allows their further use. Building a Bayesian-based model is therefore of great help during commercial cell culturing, and possibly a first step toward an automated or semi-automated neural network-based model for classifying good- and bad-quality cultures once images of such cultures become available.